Integrating HMM-Based Speech Recognition With Direct Manipulation In A Multimodal Korean Natural Language Interface
Authors
Abstract
This paper presents an HMM-based speech recognition engine and its integration into direct manipulation interfaces for a Korean document editor. Speech recognition can reduce the tedious and repetitive actions that are unavoidable in standard GUIs (graphical user interfaces). Our system consists of a general speech recognition engine called ABrain and a speech-commandable document editor called SHE. ABrain is a phoneme-based speech recognition engine that achieves a discrete-command recognition rate of up to 97%. SHE is a EuroBridge widget-based document editor that supports speech commands as well as direct manipulation interfaces.
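The abstract gives no implementation details, but the standard recipe behind a phoneme-based HMM recognizer for discrete commands is to model each command as a left-to-right chain of phoneme states and to pick the command whose model best explains the observed acoustic frames. The following is a minimal, purely illustrative Python sketch of that idea, using Viterbi scoring over toy Gaussian emissions; the command vocabulary, feature vectors, and all parameters here are invented for the example and are not taken from ABrain or SHE.

```python
import numpy as np

def log_gauss(frames, means, var=1.0):
    # Per-frame log-likelihood of each state's unit-variance Gaussian.
    # frames: (T, D) acoustic features; means: (S, D) state means.
    diff = frames[:, None, :] - means[None, :, :]          # (T, S, D)
    return -0.5 * np.sum(diff ** 2, axis=-1) / var         # (T, S)

def left_to_right(n_states, stay=0.6):
    # Log-transition matrix for a self-loop/advance phoneme chain.
    A = np.full((n_states, n_states), 1e-12)
    for s in range(n_states):
        A[s, s] = stay
        if s + 1 < n_states:
            A[s, s + 1] = 1.0 - stay
    return np.log(A)

def viterbi_score(log_trans, log_emit):
    # Best log-probability of any state path through a left-to-right HMM,
    # starting in state 0 and ending in the last state.
    T, S = log_emit.shape
    delta = np.full(S, -np.inf)
    delta[0] = log_emit[0, 0]
    for t in range(1, T):
        delta = np.max(delta[:, None] + log_trans, axis=0) + log_emit[t]
    return delta[-1]

def recognize(frames, models):
    # Score the utterance against every command's phoneme-chain HMM
    # and return the best-matching command.
    scores = {cmd: viterbi_score(left_to_right(len(means)),
                                 log_gauss(frames, means))
              for cmd, means in models.items()}
    return max(scores, key=scores.get)

# Toy two-command vocabulary: each command is a chain of phoneme states,
# each state summarized by a mean feature vector (illustrative numbers).
models = {
    "open":  np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]]),
    "close": np.array([[2.0, 2.0], [1.0, 0.0], [0.0, 2.0]]),
}
utterance = np.array([[0.1, -0.1], [0.9, 1.1], [1.1, 0.9], [2.0, 0.1]])
print(recognize(utterance, models))  # -> open
```

A real engine would use MFCC-style features and per-phoneme Gaussian-mixture emissions trained on speech data, but the decoding structure is the same.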
Similar Papers
Integrated speech and morphological processing in a connectionist continuous speech understanding for Korean
A new, tightly coupled speech and natural language integration model is presented for a TDNN-based continuous, possibly large-vocabulary speech recognition system for Korean. Unlike the popular n-best techniques developed mainly for integrating HMM-based speech recognition and natural language processing at the word level, which is clearly inadequate for a morphologically complex agglutinative language...
Eucalyptus: Integrating Natural Language Input with a Graphical User Interface
This report describes Eucalyptus, a natural language (NL) interface that has been integrated with the graphical user interface of the KOALAS Test Planning Tool, a simulated Naval air combat command system. The multimodal, multimedia interface handles both imperative commands and database queries (either typed or spoken into a microphone) while still allowing full use of the original graphical i...
Speech Interfaces to Virtual Reality
In this paper, we consider how speech interfaces can be combined with a direct manipulation interface to virtual reality. We outline the benefits of adding a speech interface and the requirements it imposes on speech recognition, language processing, and interaction design. We describe the multimodal DIVERSE system, which provides a speech interface to virtual worlds modelled in DIVE. This system can...
Speech and Language Processing for Multimodal Human-Computer Interaction
In this paper, we describe our recent work at Microsoft Research, in the project codenamed Dr. Who, aimed at the development of enabling technologies for speech-centric multimodal human-computer interaction. In particular, we present in detail MiPad as the first Dr. Who application, which specifically addresses the mobile user interaction scenario. MiPad is a wireless mobile PDA prototype that ...
Speech Input in Multimodal Environments: A Proposal to Study the Effects of Reference Visibility, Reference Number, and Task Integration
A model of complementary behavior has been suggested, based on arguments that direct manipulation and speech recognition interfaces have complementary strengths and weaknesses. Specifically, anecdotal arguments have been made that direct manipulation interfaces are best used for specifying simple actions when all references are visible and the number of references is limited, while speech recog...
Journal: CoRR
Volume: cmp-lg/9611005
Pages: -
Year of publication: 1996